Prediction with Confidence Based on a Random Forest Classifier

Authors

  • Dmitry Devetyarov
  • Ilia Nouretdinov
Abstract

Conformal predictors represent a new flexible framework that outputs region predictions with a guaranteed error rate. The efficiency of such predictions depends on the nonconformity measure that underlies the predictor. In this work we designed new nonconformity measures based on a random forest classifier. Experiments demonstrate that the proposed conformal predictors are more efficient than current benchmarks on noisy mass spectrometry data (and at least as efficient on other types of data) while maintaining the property of validity: they output fewer multiple predictions, and the ratio of mistakes does not exceed the preset level. When forced to produce singleton predictions, the designed conformal predictors are at least as accurate as the benchmarks and sometimes significantly outperform them.
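
The idea of a region prediction and a forced singleton prediction can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify their nonconformity measures, so the sketch assumes a simple stand-in (the share of trees in a forest that do not vote for a label) and an inductive setup in which nonconformity scores of held-out calibration examples are already available. All function names, labels, and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def nonconformity(vote_fractions: Dict[str, float], label: str) -> float:
    """Assumed stand-in measure: share of trees that did NOT vote for `label`."""
    return 1.0 - vote_fractions.get(label, 0.0)


def p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Share of scores (calibration plus the test one) at least as large as the test score."""
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def prediction_region(calibration_scores: Sequence[float],
                      vote_fractions: Dict[str, float],
                      significance: float) -> List[str]:
    """Keep every label whose p-value exceeds the preset significance level."""
    return sorted(
        label for label in vote_fractions
        if p_value(calibration_scores, nonconformity(vote_fractions, label)) > significance
    )


def forced_prediction(calibration_scores: Sequence[float],
                      vote_fractions: Dict[str, float]) -> str:
    """Singleton prediction: the label with the largest p-value."""
    return max(
        vote_fractions,
        key=lambda label: p_value(calibration_scores, nonconformity(vote_fractions, label)),
    )


# Hypothetical nonconformity scores of nine calibration examples (true labels).
calibration = [0.05, 0.1, 0.15, 0.25, 0.3, 0.4, 0.6, 0.7, 0.9]
# Hypothetical vote shares of a forest for one new example.
votes = {"healthy": 0.8, "cancer": 0.2}

print(prediction_region(calibration, votes, significance=0.05))  # multiple prediction
print(prediction_region(calibration, votes, significance=0.25))  # single label
print(forced_prediction(calibration, votes))
```

In this sketch a smaller significance level gives a larger region, which is the trade-off between validity and efficiency that the abstract refers to: a better nonconformity measure yields fewer multiple predictions at the same error level.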

Similar resources

Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk

This study explores a semi-supervised classification approach that uses random forest as a base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised classification approach uses unlabeled data together with a small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques support the decision-making process in healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature, and their high dimensionality results in a slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...

Forest Stand Types Classification Using Tree-Based Algorithms and SPOT-HRG Data

Forest type mapping is one of the most necessary elements of forest management and silvicultural treatments. Traditional methods such as field surveys are time-consuming and cost-intensive. Improvements in remote sensing data sources and classification-estimation methods are creating new opportunities for obtaining more accurate maps of forest biophysical attributes. This research co...

Bearing Capacity of Shallow Foundations on Cohesionless Soils: A Random Forest Based Approach

Determining the ultimate bearing capacity (UBC) is vital for the design of shallow foundations. Recently, soft computing methods (e.g., artificial neural networks and support vector machines) have been used for this purpose. In this paper, Random Forest (RF) is utilized as a tree-based ensemble classifier for predicting the UBC of shallow foundations on cohesionless soils. The inputs of the model are wi...

Publication date: 2010